{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "UMqBL77hMXP2"
      },
      "source": [
        "# Exploring and employeeing NER and POS codes\n",
        "Authors:  \n",
        " - [Lior Gazit](https://www.linkedin.com/in/liorgazit).  \n",
        " - [Meysam Ghaffari](https://www.linkedin.com/in/meysam-ghaffari-ph-d-a2553088/).  \n",
        "\n",
        "This notebook is taught and reviewed in our book:  \n",
        "**[Mastering NLP from Foundations to LLMs](https://www.amazon.com/dp/1804619183)**  \n",
        "![image.png]()\n",
        "\n",
        "This Colab notebook is referenced in our book's Github repo:   \n",
        "https://github.com/PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs   \n",
        "<a target=\"_blank\" href=\"https://colab.research.google.com/github/PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs/blob/liors_branch/Chapter4_notebooks/Ch4_NER_and_POS.ipynb\">\n",
        "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
        "</a>"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "dO3hrUVTRoMN"
      },
      "source": [
        "**The purpose of this notebook:**  \n",
        "In this notebook we review the methods of:  \n",
        "* **Named Entity Recognition**  \n",
        "* **Part of Speech**  \n",
        "As we demonstrate in Chapter 4 of the book, these methods are some of the most common and basic methods around analyzing and processing free natural text.  \n",
        "\n",
        "**Requirements:**  \n",
        "* When running in Colab, use this runtime notebook setting: `Python 3, CPU`  "
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "g54Uf66Vz9Fi"
      },
      "source": [
        ">*```Disclaimer: The content and ideas presented in this notebook are solely those of the authors and do not represent the views or intellectual property of the authors' employers.```*"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "4sRUNE9O26Ls"
      },
      "source": [
        "Install:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "id": "_W3Gq84_26ZV"
      },
      "outputs": [],
      "source": [
        "# REMARK:\n",
        "# If the below code error's out due to a Python package discrepency, it may be because new versions are causing it.\n",
        "# In which case, set \"default_installations\" to False to revert to the original image:\n",
        "default_installations = True\n",
        "\n",
        "if not default_installations:\n",
        "  import requests\n",
        "  text_file_path = \"requirements__Ch4_NER_and_POS.txt\"\n",
        "  url = \"https://raw.githubusercontent.com/PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs/main/Chapter4_notebooks/\" + text_file_path\n",
        "  res = requests.get(url)\n",
        "  with open(text_file_path, \"w\") as f:\n",
        "    f.write(res.text)\n",
        "\n",
        "  !pip install -r requirements__Ch4_NER_and_POS.txt"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "eGIieXfTRvWT"
      },
      "source": [
        "**Imports:**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "id": "fEgRDtdmPjnY"
      },
      "outputs": [],
      "source": [
        "import spacy\n",
        "from spacy import displacy"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "hI6eiQuURz0p"
      },
      "source": [
        "## NER:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "id": "WP8CysUCPnm6"
      },
      "outputs": [],
      "source": [
        "def extract_companies(input):\n",
        "  ner = spacy.load(\"en_core_web_sm\")\n",
        "  extractions = ner(input)\n",
        "  displacy.render(extractions,style=\"ent\", jupyter=True)\n",
        "  return [item.text for item in extractions.ents if item.label_ == \"ORG\"]\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 104
        },
        "id": "tiTcYnXlPqMS",
        "outputId": "b944eff5-de9d-447e-9722-c32d3f1940f5"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\">The companies that would be releasing their \n",
              "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    quarterly\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
              "</mark>\n",
              " reports \n",
              "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    tomorrow\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
              "</mark>\n",
              " are \n",
              "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    Microsoft\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
              "</mark>\n",
              ", \n",
              "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    4pm\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">TIME</span>\n",
              "</mark>\n",
              ", \n",
              "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    Google\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
              "</mark>\n",
              ", \n",
              "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    4pm\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">TIME</span>\n",
              "</mark>\n",
              ", and \n",
              "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    AT&amp;T\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
              "</mark>\n",
              ", \n",
              "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
              "    6pm\n",
              "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; vertical-align: middle; margin-left: 0.5rem\">TIME</span>\n",
              "</mark>\n",
              ".</div></span>"
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "\n",
            "The definition of the label 'ORG': Companies, agencies, institutions, etc.\n",
            "Companies: ['Microsoft', 'Google', 'AT&T']\n"
          ]
        }
      ],
      "source": [
        "text = \"The companies that would be releasing their quarterly reports tomorrow are Microsoft, 4pm, Google, 4pm, and AT&T, 6pm.\"\n",
        "companies = extract_companies(text)\n",
        "print(\"\\nThe definition of the label 'ORG': \" + spacy.explain(\"ORG\"))\n",
        "print(\"Companies:\", companies)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "1NXAa4LTR5av"
      },
      "source": [
        "## POS:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "id": "e4aZ7wbHoZlB"
      },
      "outputs": [],
      "source": [
        "from pathlib import Path"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {
        "id": "6m-Avz85QRp5"
      },
      "outputs": [],
      "source": [
        "def extract_pos(input):\n",
        "  ner = spacy.load(\"en_core_web_sm\")\n",
        "  extractions = ner(input)\n",
        "  displacy.render(extractions, style='dep', jupyter=True, options={'compact': True, 'distance': 100})\n",
        "  return [[item.text, item.pos_] for item in extractions if item.pos_ in [\"NOUN\", \"VERB\", \"ADJ\", \"PROPN\"]]\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 620
        },
        "id": "Y0N03rFjSsO_",
        "outputId": "a6501cc7-338a-44c1-a73a-dee1d8bf1371"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<span class=\"tex2jax_ignore\"><svg xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\" xml:lang=\"en\" id=\"b192c86b922945e695777a8747101b13-0\" class=\"displacy\" width=\"2150\" height=\"387.0\" direction=\"ltr\" style=\"max-width: none; height: 387.0px; color: #000000; background: #ffffff; font-family: Arial; direction: ltr\">\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"50\">The</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"50\">DET</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"150\">companies</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"150\">NOUN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"250\">that</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"250\">PRON</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"350\">would</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"350\">AUX</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"450\">be</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"450\">AUX</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"550\">releasing</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"550\">VERB</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"650\">their</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"650\">PRON</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"750\">quarterly</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"750\">ADJ</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"850\">reports</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"850\">NOUN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"950\">tomorrow</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"950\">NOUN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1050\">are</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1050\">AUX</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1150\">Microsoft,</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1150\">PROPN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1250\">4</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1250\">NUM</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1350\">pm,</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1350\">NOUN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1450\">Google,</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1450\">PROPN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1550\">4</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1550\">NUM</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1650\">pm,</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1650\">NOUN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1750\">and</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1750\">CCONJ</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1850\">AT&amp;T,</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1850\">PROPN</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"1950\">6</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"1950\">NUM</tspan>\n",
              "</text>\n",
              "\n",
              "<text class=\"displacy-token\" fill=\"currentColor\" text-anchor=\"middle\" y=\"297.0\">\n",
              "    <tspan class=\"displacy-word\" fill=\"currentColor\" x=\"2050\">pm.</tspan>\n",
              "    <tspan class=\"displacy-tag\" dy=\"2em\" fill=\"currentColor\" x=\"2050\">NOUN</tspan>\n",
              "</text>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-0\" stroke-width=\"2px\" d=\"M62,252.0 62,235.33333333333334 138.0,235.33333333333334 138.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-0\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">det</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M62,254.0 L58,246.0 66,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-1\" stroke-width=\"2px\" d=\"M162,252.0 162,168.66666666666669 1050.0,168.66666666666669 1050.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-1\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nsubj</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M162,254.0 L158,246.0 166,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-2\" stroke-width=\"2px\" d=\"M262,252.0 262,202.0 544.0,202.0 544.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-2\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nsubj</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M262,254.0 L258,246.0 266,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-3\" stroke-width=\"2px\" d=\"M362,252.0 362,218.66666666666666 541.0,218.66666666666666 541.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-3\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">aux</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M362,254.0 L358,246.0 366,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-4\" stroke-width=\"2px\" d=\"M462,252.0 462,235.33333333333334 538.0,235.33333333333334 538.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-4\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">aux</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M462,254.0 L458,246.0 466,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-5\" stroke-width=\"2px\" d=\"M162,252.0 162,185.33333333333331 547.0,185.33333333333331 547.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-5\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">relcl</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M547.0,254.0 L551.0,246.0 543.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-6\" stroke-width=\"2px\" d=\"M662,252.0 662,218.66666666666666 841.0,218.66666666666666 841.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-6\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">poss</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M662,254.0 L658,246.0 666,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-7\" stroke-width=\"2px\" d=\"M762,252.0 762,235.33333333333334 838.0,235.33333333333334 838.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-7\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">amod</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M762,254.0 L758,246.0 766,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-8\" stroke-width=\"2px\" d=\"M562,252.0 562,202.0 844.0,202.0 844.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-8\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">dobj</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M844.0,254.0 L848.0,246.0 840.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-9\" stroke-width=\"2px\" d=\"M562,252.0 562,185.33333333333331 947.0,185.33333333333331 947.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-9\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">npadvmod</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M947.0,254.0 L951.0,246.0 943.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-10\" stroke-width=\"2px\" d=\"M1062,252.0 1062,235.33333333333334 1138.0,235.33333333333334 1138.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-10\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">attr</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1138.0,254.0 L1142.0,246.0 1134.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-11\" stroke-width=\"2px\" d=\"M1262,252.0 1262,235.33333333333334 1338.0,235.33333333333334 1338.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-11\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nummod</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1262,254.0 L1258,246.0 1266,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-12\" stroke-width=\"2px\" d=\"M1162,252.0 1162,218.66666666666666 1341.0,218.66666666666666 1341.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-12\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">appos</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1341.0,254.0 L1345.0,246.0 1337.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-13\" stroke-width=\"2px\" d=\"M1362,252.0 1362,235.33333333333334 1438.0,235.33333333333334 1438.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-13\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">npadvmod</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1438.0,254.0 L1442.0,246.0 1434.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-14\" stroke-width=\"2px\" d=\"M1562,252.0 1562,235.33333333333334 1638.0,235.33333333333334 1638.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-14\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nummod</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1562,254.0 L1558,246.0 1566,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-15\" stroke-width=\"2px\" d=\"M1362,252.0 1362,218.66666666666666 1641.0,218.66666666666666 1641.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-15\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">conj</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1641.0,254.0 L1645.0,246.0 1637.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-16\" stroke-width=\"2px\" d=\"M1362,252.0 1362,202.0 1744.0,202.0 1744.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-16\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">cc</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1744.0,254.0 L1748.0,246.0 1740.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-17\" stroke-width=\"2px\" d=\"M1362,252.0 1362,185.33333333333331 1847.0,185.33333333333331 1847.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-17\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">conj</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1847.0,254.0 L1851.0,246.0 1843.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-18\" stroke-width=\"2px\" d=\"M1962,252.0 1962,235.33333333333334 2038.0,235.33333333333334 2038.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-18\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">nummod</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M1962,254.0 L1958,246.0 1966,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "\n",
              "<g class=\"displacy-arrow\">\n",
              "    <path class=\"displacy-arc\" id=\"arrow-b192c86b922945e695777a8747101b13-0-19\" stroke-width=\"2px\" d=\"M1862,252.0 1862,218.66666666666666 2041.0,218.66666666666666 2041.0,252.0\" fill=\"none\" stroke=\"currentColor\"/>\n",
              "    <text dy=\"1.25em\" style=\"font-size: 0.8em; letter-spacing: 1px\">\n",
              "        <textPath xlink:href=\"#arrow-b192c86b922945e695777a8747101b13-0-19\" class=\"displacy-label\" startOffset=\"50%\" side=\"left\" fill=\"currentColor\" text-anchor=\"middle\">appos</textPath>\n",
              "    </text>\n",
              "    <path class=\"displacy-arrowhead\" d=\"M2041.0,254.0 L2045.0,246.0 2037.0,246.0\" fill=\"currentColor\"/>\n",
              "</g>\n",
              "</svg></span>"
            ],
            "text/plain": [
              "<IPython.core.display.HTML object>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "text/plain": [
              "[['companies', 'NOUN'],\n",
              " ['releasing', 'VERB'],\n",
              " ['quarterly', 'ADJ'],\n",
              " ['reports', 'NOUN'],\n",
              " ['tomorrow', 'NOUN'],\n",
              " ['Microsoft', 'PROPN'],\n",
              " ['pm', 'NOUN'],\n",
              " ['Google', 'PROPN'],\n",
              " ['pm', 'NOUN'],\n",
              " ['AT&T', 'PROPN'],\n",
              " ['pm', 'NOUN']]"
            ]
          },
          "execution_count": 16,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "extract_pos(text)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {
        "id": "hmJ6Y6rl0aJw"
      },
      "outputs": [],
      "source": []
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
